DNA sequence and structural properties as predictors of human and mouse promoters
Abstract
Promoters play a central role in gene regulation, yet our power to discriminate them from non-promoter sequences in higher eukaryotes is mainly restricted to those associated with CpG islands. Here, we examined in silico the promoters of 30,954 human and 18,083 mouse transcripts in the DBTSS database to assess the impact of particular sequence and structural features (propeller twist, bendability and nucleosome positioning preference) on promoter classification and prediction. Our analysis showed that a stricter-than-traditional definition of CpG islands captures the low and high CpG count promoter classes more accurately than the traditional one. We observed that both human and mouse promoter sequences are flexible, with the exception of the TATA box and the TSS, which are rigid regions irrespective of association with a CpG island. Therefore, varying levels of structural flexibility in promoters may affect their accessibility to proteins, and hence their specificity. For all features investigated, averaged values across core promoters discriminated CpG island associated promoters from background, whereas the same did not hold for promoters without a CpG island. However, local changes around -34 to -23 (the expected position of the TATA box) and around the TSS were informative in discriminating promoters of both classes from non-promoter sequences. Additionally, we investigated ATG deserts and observed that they occur in all promoter sets except, in human, those with a TATA box and without a CpG island. Interestingly, all mouse promoter sets showed ATG codon depletion irrespective of the presence of a TATA box, possibly reflecting a weaker contribution to TSS specificity in mouse.
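To make the contrast between a traditional and a stricter CpG island definition concrete, the following is a minimal sketch of how a sequence window could be tested against two sets of thresholds. The abstract does not state which thresholds were used, so the values here are assumptions taken from commonly cited criteria (length of at least 200 bp, GC fraction of at least 0.50 and observed/expected CpG ratio of at least 0.60 for the traditional set; 500 bp, 0.55 and 0.65 for the stricter set). The names `CpGCriteria`, `TRADITIONAL`, `STRICT` and `meets_criteria` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpGCriteria:
    """Thresholds for calling a window a CpG island (illustrative only)."""
    min_length: int
    min_gc: float
    min_obs_exp: float


# Assumed thresholds: commonly cited values, not confirmed by the abstract.
TRADITIONAL = CpGCriteria(min_length=200, min_gc=0.50, min_obs_exp=0.60)
STRICT = CpGCriteria(min_length=500, min_gc=0.55, min_obs_exp=0.65)


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def meets_criteria(seq: str, criteria: CpGCriteria) -> bool:
    """True if the whole sequence satisfies all three thresholds."""
    return (
        len(seq) >= criteria.min_length
        and gc_fraction(seq) >= criteria.min_gc
        and cpg_obs_exp(seq) >= criteria.min_obs_exp
    )


if __name__ == "__main__":
    window = "CG" * 150  # 300 bp toy sequence, not real promoter data
    print(meets_criteria(window, TRADITIONAL))
    print(meets_criteria(window, STRICT))
```

Under these assumed thresholds, a window can pass the traditional test while failing the stricter one, which is the kind of reclassification the abstract describes. A real analysis would scan sliding windows along each promoter rather than test a single fixed sequence.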
Related resources
Large-scale structural analysis of the core promoter in mammalian and plant genomes
DNA encodes at least two independent levels of functional information. The first level is for encoding proteins and sequence targets for DNA-binding factors, while the second one is contained in the physical and structural properties of the DNA molecule itself. Although the physical and structural properties are ultimately determined by the nucleotide sequence itself, the cell exploits these pr...
DNA motifs in human and mouse proximal promoters predict tissue-specific expression.
Comprehensive identification of cis-regulatory elements is necessary for accurately reconstructing gene regulatory networks. We studied proximal promoters of human and mouse genes with differential expression across 56 terminally differentiated tissues. Using in silico techniques to discover, evaluate, and model interactions among sequence elements, we systematically identified regulatory modul...
Evidence of genome-wide G4 DNA-mediated gene expression in human cancer cells
Guanine-rich DNA of a particular sequence adopts four-stranded structural forms known as G-quadruplex or G4 DNA. Though in vitro formation of G4 DNA has been known for several years, the in vivo presence of G4 DNA was only recently noted in eukaryote telomeres. Recent bioinformatics analyses showing prevalence of G4 DNA within promoters of human and related species seem to implicate G4 DNA in a genome-w...
P-157: Polymorphic Core Promoter GA-repeats Alter Gene Expression of The Early Embryonic Developmental Genes
Background: We examine the GA-repeat core promoters of MECOM and GABRA3 in the human embryonic kidney-293 cell line and show that those GA-repeats have promoter activity, and that different alleles of the repeats can significantly alter gene expression. We propose a novel role for GA-repeat core promoters to regulate gene expression in the genes involved in development and evolution. Materials and M...
Isolation and Sequence Analysis of GpdII Promoter of the White Button Mushroom (Agaricus bisporus) from Strains Holland737 and IM008
Many recent studies have shown that glycosylation patterns of Agaricus bisporus are similar to those of mammals, so this organism is a good candidate for the expression of glycosylated pharmaceutical proteins. To achieve constant expression of the gene of interest in all cells of the organism, isolation of a suitable promoter is necessary. To isolate this promoter, PCR with specific primers was perform...